LookAhead: Augmenting Crowdsourced Website Reputation Systems With Predictive Modeling

Abstract

Unsafe websites consist of malicious as well as inappropriate sites, such as those hosting questionable or offensive content. Website reputation systems are intended to help ordinary users steer away from these unsafe sites. However, the process of assigning safety ratings for websites typically involves humans. Consequently, it is time consuming, costly and not scalable. This has resulted in two major problems: (i) a significant proportion of the web space remains unrated and (ii) there is an unacceptable time lag before new websites are rated. In this paper, we show that by leveraging structural and content-based properties of websites, it is possible to reliably and efficiently predict their safety ratings, thereby mitigating both problems. We demonstrate the effectiveness of our approach using four datasets of up to 90,000 websites. We use ratings from Web of Trust (WOT), a popular crowdsourced web reputation system, as ground truth. We propose a novel ensemble classification technique that makes opportunistic use of available structural and content properties of webpages to predict their eventual ratings in two dimensions used by WOT: trustworthiness and child safety. Ours is the first classification system to predict such subjective ratings and the same approach works equally well in identifying malicious websites. Across all datasets, our classification performs well with average F$_1$-score in the 74–90% range.
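The abstract does not say how the ensemble combines its base classifiers, so the following is only an illustrative sketch of what "opportunistic use of available properties" could look like, not the authors' method. It assumes one scorer per feature group, each returning a probability that the site is rated good, and averages only the scorers whose features are present for a given site. The names `ensemble_predict`, `toy_scorers` and the toy scoring rules are hypothetical. The `f1_score` helper is the standard F$_1$ definition (harmonic mean of precision and recall).

```python
from typing import Callable, Dict, Optional, Sequence

FeatureVector = Sequence[float]
# Hypothetical scorer type: maps a feature vector to P(site is rated good) in [0, 1].
Scorer = Callable[[FeatureVector], float]


def ensemble_predict(
    scorers: Dict[str, Scorer],
    features: Dict[str, Optional[FeatureVector]],
    threshold: float = 0.5,
) -> Optional[bool]:
    """Average the scores of the feature groups that are present for this site.

    Returns None when no feature group is available, so the caller can leave
    the site unrated instead of guessing. Plain averaging is an assumption
    made for this sketch.
    """
    scores = [
        scorer(features[name])
        for name, scorer in scorers.items()
        if features.get(name) is not None
    ]
    if not scores:
        return None
    return sum(scores) / len(scores) >= threshold


def f1_score(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Standard F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy stand-ins for trained base classifiers (purely illustrative rules).
toy_scorers: Dict[str, Scorer] = {
    "structural": lambda v: 1.0 if v[0] < 10 else 0.0,  # e.g. a link-count feature
    "content": lambda v: 1.0 - v[0],                    # e.g. fraction of flagged terms
}

if __name__ == "__main__":
    # Only structural features could be extracted for this site.
    print(ensemble_predict(toy_scorers, {"structural": [3], "content": None}))
    # Both feature groups are available.
    print(ensemble_predict(toy_scorers, {"structural": [50], "content": [0.2]}))
    print(f1_score([True, True, False, False], [True, False, True, False]))
```

In this sketch, a site with missing content features is still scored from its structural features alone, which is one way a classifier could avoid leaving partially observed sites unrated.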